Abstract

We study a sensor node with an energy harvesting source. The generated energy can be stored in a buffer.
The sensor node periodically senses a random field and generates a packet. These packets are stored in a queue
and transmitted using the energy available at that time. We obtain energy management policies that are throughput 
optimal, i.e., the data queue stays stable for the largest possible data rate. Next we obtain energy management 
policies which minimize the mean delay in the queue. We also compare performance of several easily implementable 
sub-optimal energy management policies. A greedy policy is identified which, in the low SNR regime, is throughput
optimal and also minimizes mean delay.

Keywords: Optimal energy management policies, energy harvesting, sensor networks. 

I. Introduction 

Sensor networks consist of a large number of small, inexpensive sensor nodes. These nodes have small 
batteries with limited power and also have limited computational power and storage space. When the 
battery of a node is exhausted, it is not replaced and the node dies. When a sufficient number of nodes
die, the network may not be able to perform its designated task. Thus the life time of a network is an 
important characteristic of a sensor network ([4]) and it is tied up with the life time of a node. 

Various studies have been conducted to increase the life time of the battery of a node by reducing 
the energy intensive tasks, e.g., reducing the number of bits to transmit ([22], [5]), making a node go
into power saving modes (sleep/listen) periodically ([28]), using energy efficient routing ([30], [25]) and
MAC ([31]). Studies that estimate the life time of a sensor network include [25]. A general survey on 
sensor networks is [1] which provides many more references on these issues. 

In this paper we focus on increasing the life time of the battery itself by energy harvesting techniques 
([14], [21]). Common energy harvesting devices are solar cells, wind turbines and piezo-electric cells, 
which extract energy from the environment. Among these, solar energy harvesting through the photovoltaic
effect seems to have emerged as a technology of choice for many sensor nodes ([21], [23]). Unlike for
a battery operated sensor node, now there is potentially an infinite amount of energy available to the 
node. Hence energy conservation need not be the dominant theme. Rather, the issues involved in a node 
with an energy harvesting source can be quite different. The source of energy and the energy harvesting 
device may be such that the energy cannot be generated at all times (e.g., a solar cell). However one 
may want to use the sensor nodes at such times also. Furthermore the rate of generation of energy can 
be limited. Thus one may want to match the energy generation profile of the harvesting source with the 
energy consumption profile of the sensor node. If the energy can be stored in the sensor node then this 
matching can be considerably simplified. But the energy storage device may have limited capacity. Thus, 
one may also need to modify the energy consumption profile of the sensor node so as to achieve the 
desired objectives with the given energy harvesting source. It should be done in such a way that the node 
can perform satisfactorily for a long time, i.e., energy starvation, at least, should not be the reason for the
node to die. In [14] such an energy/power management scheme is called energy neutral operation (if the 
energy harvesting source is the only energy source at the node, e.g., the node has no battery). Also, in a 
sensor network, the routing and relaying of data through the network may need to be suitably modified 
to match the energy generation profiles of different nodes, which may vary with the nodes. 

In the following we survey the literature on sensor networks with energy harvesting nodes. Early papers 
on energy harvesting in sensor networks are [15] and [24]. A practical solar energy harvesting sensor 
node prototype is described in [12]. A good recent contribution is [14]. It provides various deterministic 
theoretical models for energy generation and energy consumption profiles (based on $(\sigma, \rho)$ traffic models
in [8]) and provides conditions for energy neutral operation. In [11] a sensor node is considered which is
sensing certain interesting events. The authors study optimal sleep-wake cycles such that event detection
probability is maximized. This problem is also studied in [3]. A recent survey is [21] which also provides
an optimal sleep-wake cycle for solar cells so as to obtain QoS for a sensor node.

In this paper we study a sensor node with an energy harvesting source. The motivating application is
estimation of a random field which is one of the canonical applications of sensor networks. The above 
mentioned theoretical studies are motivated by other applications of sensor networks. In our application, 
the sensor nodes sense the random field periodically. After sensing, a node generates a packet (possibly 
after efficient compression). This packet needs to be transmitted to a central node, possibly via other 
sensor nodes. In an energy harvesting node, sometimes there may not be sufficient energy to transmit the 
generated packets (or even sense) at regular intervals and then the node may need to store the packets till 
they are transmitted. The energy generated can be stored (possibly in a finite storage) for later use. 

Initially we will assume that most of the energy is consumed in transmission only. We will relax this 
assumption later on. We find conditions for energy neutral operation of the system, i.e., when the system 
can work forever and the data queue is stable. We will obtain policies which can support maximum 
possible data rate. 

We also obtain energy management (power control) policies for transmission which minimize the mean 
delay of the packets in the queue. 

Our energy management policies can be used with sleep-wake cycles. Our policies can be used on
a faster time scale during the wake period of a sleep-wake cycle. When the energy harvesting profile 
generates minimal energy (e.g., in solar cells) then one may schedule the sleep period. 

We have used the above energy management policies at a MAC (Multiple Access Channel) used by
energy harvesting sensor nodes in [27]. 

We are currently investigating appropriate routing algorithms for a network of energy harvesting sensor 
nodes. 

The paper is organized as follows. Section II describes the model and provides the assumptions made
for data and energy generation. Section III provides conditions for energy neutral operation. We obtain
stable, power control policies which are throughput optimal. Section IV obtains the power control policies
which minimize the mean delay via Markov decision theory. A greedy policy is shown to be throughput
optimal and provides minimum mean delays for linear transmission. Section V provides a throughput
optimal policy when the energy consumed in sensing and processing is nonnegligible. A sensor node
with a fading channel is also considered. Section VI provides simulation results to confirm our theoretical
findings and compares various energy management policies. Section VII concludes the paper. The appendix
provides proof of the lemma used in proving existence of an optimal policy.

II. Model and notation



In this section we present our model for a single energy harvesting sensor node. 



[Figure: a data buffer with arrivals $X_k$ and departures $\min(q_k, g(T_k))$, and an energy buffer.]

Fig. 1. The model

We consider a sensor node (Fig. 1) which is sensing a random field and generating packets to be
transmitted to a central node via a network of sensor nodes. The system is slotted. During slot $k$ (defined
as time interval $[k, k+1]$, i.e., a slot is a unit of time) $X_k$ bits are generated by the sensor node. Although
the sensor node may generate data as packets, we will allow arbitrary fragmentation of packets during
transmission. Thus, packet boundaries are not important and we consider bit strings (or just fluid). The
bits $X_k$ are eligible for transmission in the $(k+1)$st slot. The queue length (in bits) at time $k$ is $q_k$. The
sensor node is able to transmit $g(T_k)$ bits in slot $k$ if it uses energy $T_k$. We assume that transmission
consumes most of the energy in a sensor node and ignore other causes of energy consumption (this is
true for many low quality, low rate sensor nodes ([23])). This assumption will be removed in Section V.
We denote by $E_k$ the energy available in the node at time $k$. The sensor node is able to replenish energy
by $Y_k$ in slot $k$.

We will initially assume that $\{X_k\}$ and $\{Y_k\}$ are iid but will generalize this assumption later. It is
important to generalize this assumption to capture realistic traffic streams and energy generation profiles.
The processes $\{q_k\}$ and $\{E_k\}$ satisfy

$$q_{k+1} = (q_k - g(T_k))^+ + X_k, \qquad (1)$$

$$E_{k+1} = (E_k - T_k) + Y_k, \qquad (2)$$

where $T_k \le E_k$. This assumes that the data buffer and the energy storage buffer are infinite. If in practice
these buffers are large enough, this is a good approximation. If not, even then these results provide
important insights and the policies obtained often provide good performance for the finite buffer case.
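
The recursions (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the names `step` and `simulate`, and the way the policy and the samplers are passed in, are assumptions made here and do not come from the paper.

```python
import random

def step(q, E, T, X, Y, g):
    """Advance one slot of (1)-(2) and return (q_next, E_next)."""
    if not 0.0 <= T <= E:
        raise ValueError("a feasible policy must satisfy 0 <= T_k <= E_k")
    q_next = max(q - g(T), 0.0) + X   # equation (1)
    E_next = (E - T) + Y              # equation (2)
    return q_next, E_next

def simulate(policy, g, sample_X, sample_Y, n_slots, seed=0):
    """Return the time-averaged queue length under policy(q, E) -> T."""
    rng = random.Random(seed)
    q, E, total_q = 0.0, 0.0, 0.0
    for _ in range(n_slots):
        T = policy(q, E)
        q, E = step(q, E, T, sample_X(rng), sample_Y(rng), g)
        total_q += q
    return total_q / n_slots
```

For example, `simulate(lambda q, E: E, lambda t: 10 * t, lambda r: r.expovariate(1.0), lambda r: r.expovariate(1.0), 10_000)` would run a policy that spends all stored energy in every slot with a linear $g$.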

The function $g$ will be assumed to be monotonically non-decreasing. An important such function is
given by Shannon's capacity formula

$$g(T_k) = \frac{1}{2}\log(1 + \beta T_k)$$

for Gaussian channels where $\beta$ is a constant such that $\beta T_k$ is the SNR. This is a non-decreasing concave
function. At low values of $T_k$, $g(T_k) \approx \beta_1 T_k$, i.e., $g$ becomes a linear function. Since sensor nodes are
energy constrained, this is a practically important case. Thus in the following we limit our attention to
linear and concave nondecreasing functions $g$. We will also assume that $g(0) = 0$ which always holds in
practice.
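
The low-SNR linearization can be checked numerically. The sketch below assumes the natural logarithm, in which case the first-order Taylor expansion gives slope $\beta_1 = \beta/2$; the function names and the value $\beta = 1$ are illustrative choices, not values from the paper.

```python
import math

def g_shannon(T, beta=1.0):
    """g(T) = (1/2) log(1 + beta*T), natural log assumed."""
    return 0.5 * math.log(1.0 + beta * T)

def g_linear(T, beta=1.0):
    """First-order Taylor expansion of g_shannon around T = 0."""
    return 0.5 * beta * T

for T in (0.001, 0.01, 0.1, 1.0):
    exact, approx = g_shannon(T), g_linear(T)
    rel_err = (approx - exact) / exact
    print(f"T={T:<6} exact={exact:.6f} linear={approx:.6f} rel.err={rel_err:.2%}")
```

The relative error grows with $T$, so the linear model is reasonable only when the energy spent per slot is small relative to $1/\beta$.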

Many of our results (especially the stability results) will be valid when $\{X_k\}$ and $\{Y_k\}$ are stationary,
ergodic. These assumptions are general enough to cover most of the stochastic models developed for 
traffic (e.g., Markov modulated) and energy harvesting. 

Of course, in practice, statistics of the traffic and energy harvesting models will be time varying (e.g., 
solar cell energy harvesting will depend on the time of day). But often they can be approximated by 
piecewise stationary processes. For example, energy harvesting by solar cells could be taken as being 
stationary over one hour periods. Then our results could be used over these time periods. Often these 
periods are long enough for the system to attain (approximate) stationarity and for our results to remain 
meaningful. 

In Section III we study the stability of this queue and identify easily implementable energy management
policies which provide good performance. 

III. Stability 

We will obtain a necessary condition for stability. Then we present a transmission policy which achieves 
the necessary condition, i.e., the policy is throughput optimal. The mean delay for this policy is not 
minimal. Thus, we obtain other policies which provide lower mean delay. In the next section we will 
consider optimal policies. 

Let us assume that we have obtained an (asymptotically) stationary and ergodic transmission policy
$\{T_k\}$ which makes $\{q_k\}$ (asymptotically) stationary with the limiting distribution independent of $q_0$. Taking
$\{T_k\}$ asymptotically stationary seems to be a natural requirement to obtain (asymptotic) stationarity of
$\{q_k\}$.

Lemma 1 Let $g$ be concave nondecreasing and $\{X_k\}$, $\{Y_k\}$ be stationary, ergodic sequences. For $\{T_k\}$
to be an asymptotically stationary, ergodic energy management policy that makes $\{q_k\}$ asymptotically
stationary with a proper stationary distribution $\pi$ it is necessary that $E[X_k] < E_\pi[g(T)] \le g(E[Y])$.

Proof: Let the system start with $q_0 = E_0 = 0$. Then for each $n$, $n^{-1}\sum_{k=1}^{n} T_k \le n^{-1}\sum_{k=1}^{n} Y_k + Y_0/n$.
Thus, if $n^{-1}\sum_{k=1}^{n} T_k \to E[T]$ a.s., then $E[T] \le E[Y]$. Also then $n^{-1}\sum_{k=1}^{n} g(T_k) \to E[g(T)]$ a.s.

Thus from results on G/G/1 queues [6], $E[g(T)] > E[X]$ is needed for the (asymptotic) stationarity of
$\{q_k\}$. If $g$ is linear then the above inequalities imply that for stationarity of $\{q_k\}$ we need

$$E[X] < E[g(T)] = g(E[T]) \le g(E[Y]) = E[g(Y)]. \qquad (3)$$

If $g$ is concave, then we need

$$E[X] < E[g(T)] \le g(E[T]) \le g(E[Y]). \qquad (4)$$

Thus $E[X] < g(E[Y])$ is a necessary condition to get an (asymptotically) stationary sequence $\{g(T_k)\}$
which provides an asymptotically stationary $\{q_k\}$. ■

Let

$$T_k = \min(E_k, E[Y] - \epsilon) \qquad (5)$$

where $\epsilon$ is an appropriately chosen small constant (see statement of Theorem 1). We show that it is a
throughput optimal policy, i.e., using this $T_k$ with $g$ satisfying the assumptions in Lemma 1, $\{q_k\}$ is
asymptotically stationary and ergodic.

Theorem 1 If $\{X_k\}$, $\{Y_k\}$ are stationary, ergodic, $g$ is continuous, nondecreasing, concave then if
$E[X_k] < g(E[Y])$, (5) makes the queue stable (with $\epsilon > 0$ such that $E[X] < g(E[Y] - \epsilon)$), i.e., it has
a unique, stationary, ergodic distribution and starting from any initial distribution, $q_k$ converges in total
variation to the stationary distribution.

Proof: If we take $T_k = \min(E_k, E[Y] - \epsilon)$ for any arbitrarily small $\epsilon > 0$, then from (2), $E_k \to \infty$
a.s. and $T_k \to E[Y] - \epsilon$ a.s. If $g$ is continuous in a neighbourhood of $E[Y]$ then by monotonicity of $g$
we also get $g(T_k) \to g(E[Y] - \epsilon)$ a.s. Hence $E[g(T_k)] \to g(E[Y] - \epsilon)$. We also get $E[T_k] \to E[Y] - \epsilon$.
Thus $\{g(T_k)\}$ is asymptotically stationary and ergodic. Therefore, from G/G/1 queue results [6], [19] for
$T_k = \min(E_k, E[Y] - \epsilon)$, $E[X] < g(E[Y] - \epsilon)$ is a sufficient condition for $\{q_k\}$ to be asymptotically
stationary and ergodic whenever $\{X_k\}$ is stationary and ergodic. The other conclusions also follow. Since
$g$ is non-decreasing and $g(0) = 0$, $E[X] < g(E[Y])$ implies that there is an $\epsilon > 0$ such that
$E[X] < g(E[Y] - \epsilon)$. ■

Henceforth we denote the policy (5) by TO.
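
A small simulation sketch of the TO policy follows. The choices $g(t) = \log(1+t)$, exponential $X_k$ and $Y_k$, $\epsilon = 0.05$ and all function names are assumptions made for illustration; with these choices $g(E[Y] - \epsilon) = \log(1.95) \approx 0.67$, so a load of 0.3 lies inside the stability region and a load of 0.9 lies outside it.

```python
import math
import random

def to_policy(E, mean_Y, eps):
    """TO policy (5): T_k = min(E_k, E[Y] - eps)."""
    return min(E, mean_Y - eps)

def mean_queue_TO(mean_X, mean_Y=1.0, eps=0.05, n_slots=200_000, seed=1):
    """Time-averaged queue length under TO with g(t) = log(1 + t) (assumed)."""
    rng = random.Random(seed)
    g = lambda t: math.log(1.0 + t)
    q = E = total = 0.0
    for _ in range(n_slots):
        T = to_policy(E, mean_Y, eps)
        q = max(q - g(T), 0.0) + rng.expovariate(1.0 / mean_X)  # equation (1)
        E = E - T + rng.expovariate(1.0 / mean_Y)               # equation (2)
        total += q
    return total / n_slots
```

Comparing `mean_queue_TO(0.3)` with `mean_queue_TO(0.9)` should show a bounded average in the first case and an average that keeps growing with `n_slots` in the second.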

From results on GI/GI/1 queues ([2]), if $\{X_k\}$ are iid, $E[X] < g(E[Y])$, $T_k = \min(E_k, E[Y] - \epsilon)$ and
$E[X^\alpha] < \infty$ for some $\alpha > 1$ then the stationary solution $\{q_k\}$ of (1) satisfies $E[q^{\alpha-1}] < \infty$.

Taking $T_k = Y_k$ for all $k$ will provide stability of the queue if $E[X] < E[g(Y)]$. If $g$ is linear then this
coincides with the necessary condition. If $g$ is strictly concave then $E[g(Y)] < g(E[Y])$ unless $Y = E[Y]$ a.s.
Thus $T_k = Y_k$ provides a strictly smaller stability region. We will be forced to use this policy if there is
no buffer to store the energy harvested. This shows that storing energy allows us to have a larger stability
region. We will see in Section VI that storing energy can also provide lower mean delays.
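
The gap between $E[g(Y)]$ and $g(E[Y])$ can be estimated by Monte Carlo. The sketch below assumes $g(y) = \log(1+y)$ and exponential $Y$ with mean 10. Both are assumptions made here; this choice happens to reproduce the thresholds 2.01 and 2.40 quoted for Fig. 5 in Section VI, but the paper does not state the mean used there.

```python
import math
import random

def stability_thresholds(g, sample_Y, n=200_000, seed=0):
    """Estimate (E[g(Y)], g(E[Y])): the unbuffered and buffered load limits."""
    rng = random.Random(seed)
    ys = [sample_Y(rng) for _ in range(n)]
    mean_Y = sum(ys) / n
    unbuffered = sum(g(y) for y in ys) / n   # estimate of E[g(Y)]
    buffered = g(mean_Y)                     # estimate of g(E[Y])
    return unbuffered, buffered

u, b = stability_thresholds(lambda y: math.log(1.0 + y),
                            lambda rng: rng.expovariate(1.0 / 10.0))
print(f"E[g(Y)] ~ {u:.2f}, g(E[Y]) ~ {b:.2f}")
```

By Jensen's inequality the first value can never exceed the second for concave $g$, which is the sense in which an energy buffer enlarges the stability region.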

Although TO is a throughput optimal policy, if $q_k$ is small, we may be wasting some energy. Thus, it
appears that this policy does not minimize mean delay. It is useful to look for policies which minimize
mean delay. Based on our experience in [26], the Greedy policy

$$T_k = \min(E_k, f(q_k)) \qquad (6)$$

where $f = g^{-1}$, looks promising. In Theorem 2, we will show that the stability condition for this policy
is $E[X] < E[g(Y)]$ which is optimal for linear $g$ but strictly suboptimal for a strictly concave $g$. We will
also show in Section IV that when $g$ is linear, (6) is not only throughput optimal, it also minimizes long
term mean delay.

For concave $g$, we will show via simulations that (6) provides less mean delay than TO at low load.
However since its stability region is smaller than that of the TO policy, at $E[X]$ close to $E[g(Y)]$, the
Greedy performance rapidly deteriorates. Thus it is worthwhile to look for some other good policy. Notice
that the TO policy wastes energy if $q_k < g(E[Y] - \epsilon)$. Thus we can improve upon it by saving the energy
$(E[Y] - \epsilon - g^{-1}(q_k))$ and using it when $q_k$ is greater than $g(E[Y] - \epsilon)$. However for $g$ a log function,
using a large amount of energy $t$ is also wasteful even when $q_k > g(t)$. Taking into account these facts
we improve over the TO policy as

$$T_k = \min\left(g^{-1}(q_k),\; E_k,\; 0.99\,(E[Y] + 0.001\,(E_k - c q_k)^+)\right) \qquad (7)$$



where $c$ is a positive constant. The improvement over the TO also comes from the fact that if $E_k$ is large,
we allow $T_k > E[Y]$ but only if $q_k$ is not very large. The constants 0.99 and 0.001 were chosen by trial
and error from simulations after experimenting with different scenarios. We will see in Section VI via
simulations that the policy, to be denoted by MTO, can indeed provide lower mean delays than TO at
loads above $E[g(Y)]$.
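
The Greedy policy (6) and the MTO policy (7) can be written as small functions. The sketch assumes $g(t) = \log(1+t)$, so that $g^{-1}(q) = e^q - 1$; the function names are illustrative, and the default $c = 0.1$ is the value used in the simulations of Section VI.

```python
import math

def g(t):
    """Assumed rate function g(t) = log(1 + t)."""
    return math.log(1.0 + t)

def g_inv(q):
    """f = g^{-1} for the assumed g, i.e. exp(q) - 1."""
    return math.expm1(q)

def greedy(q, E):
    """Greedy policy (6): spend just enough to clear the queue, if available."""
    return min(E, g_inv(q))

def mto(q, E, mean_Y, c=0.1):
    """Modified TO policy (7)."""
    cap = 0.99 * (mean_Y + 0.001 * max(E - c * q, 0.0))
    return min(g_inv(q), E, cap)
```

With a long queue and a well-stocked energy buffer, for instance `mto(10, 100, 1.0)`, the cap term exceeds $E[Y]$ slightly, which is the extra spending that (7) allows when $E_k$ is large relative to $c\,q_k$.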

One advantage of (5) over (6) and (7) is that while using (5), after some time $T_k = E[Y] - \epsilon$. Also, at
any time, either one uses up all the energy or uses $E[Y] - \epsilon$. Thus one can use this policy even if exact
information about $E_k$ is not available (measuring $E_k$ may be difficult in practice). In fact, (5) does not
need even $q_k$ while (6) either uses up all the energy or uses $f(q_k)$ and hence needs only $q_k$ exactly.

Now we show that under the greedy policy (6) the queueing process is stable when $E[X] < E[g(Y)]$.
In the next few results we assume that the energy buffer is finite, although large. For this case Lemma 1 and
Theorem 1 also hold under the same assumptions with slight modifications in their proofs.

Theorem 2 If the energy buffer is finite, i.e., $E_k \le \bar{e} < \infty$ (but $\bar{e}$ is large enough) and $E[X] < E[g(Y)]$
then under the greedy policy (6), $(q_k, E_k)$ has an ergodic set.

Proof: To prove that $(q_k, E_k)$ has an ergodic set [20], we use the Lyapunov function $h(q, e) = q$ and
show that this has a negative drift outside a large enough set of state space

$$A = \{(q, e) : q + e > \beta\}$$

where $\beta > 0$ is appropriately chosen. If we take $\beta$ large enough, because $e \le \bar{e}$, $(q, e) \in A$ will ensure
that $q$ is appropriately large. We will specify our requirements on this later.

For $(q, e) \in A$, $M > 0$ fixed, since we are using greedy policy

$$E[h(q_{k+M}, E_{k+M}) - h(q_k, E_k) \mid (q_k, E_k) = (q, e)]$$
$$= E[(\cdots((q - g(T_k))^+ + X_k - g(T_{k+1}))^+ + X_{k+1} \cdots - g(T_{k+M-1}))^+ + X_{k+M-1} - q \mid (q_k, E_k) = (q, e)]. \qquad (8)$$

Because $T_n \le E_n \le \bar{e}$, we can take $\beta$ large enough such that the RHS of (8) equals

$$E\left[q + \sum_{n=k}^{k+M-1} X_n - \sum_{n=k+1}^{k+M-1} g(T_n) - g(e) - q \,\middle|\, (q_k, E_k) = (q, e)\right].$$

Thus to have (8) less than $-\epsilon_2$ for some $\epsilon_2 > 0$, it is sufficient that

$$M E[X] < E\left[\sum_{n=k+1}^{k+M-1} g(T_n)\right] + g(e).$$

This can be ensured for any $e$ because we can always take $T_n \ge \min(\bar{e}, Y_{n-1})$ with probability $> 1 - \delta$ (for
any given $\delta > 0$) for $n = k+1, \ldots, k+M-1$ if in addition we also have $M E[X] < (M-1) E[g(Y)]$ and $\bar{e}$
is large enough. This can be ensured for a large enough $M$ because $E[X] < E[g(Y)]$. ■

The above result will ensure that the Markov chain $\{(q_k, E_k)\}$ is ergodic and hence has a unique
stationary distribution if $\{(q_k, E_k)\}$ is irreducible. A sufficient condition for this is $0 < P[X_k = 0] < 1$
and $0 < P[Y_k = 0] < 1$ because then the state $(0, 0)$ can be reached from any state with a positive
probability. In general, $\{(q_k, E_k)\}$ can have multiple ergodic sets. Then, depending on the initial state,
$\{(q_k, E_k)\}$ will converge to one of the ergodic sets and the limiting distribution depends on the initial
conditions.



IV. Optimal Policies 



In this section we choose $T_k$ at time $k$ as a function of $q_k$ and $E_k$ such that

$$E\left[\sum_{k=0}^{\infty} \alpha^k q_k\right]$$

is minimized where $0 < \alpha < 1$ is a suitable constant. The minimizing policy is called $\alpha$-discount optimal.
When $\alpha = 1$, we minimize

$$\limsup_{n \to \infty} \frac{1}{n} E\left[\sum_{k=0}^{n-1} q_k\right].$$

This optimizing policy is called average cost optimal. By Little's law [2] an average cost optimal policy
also minimizes mean delay. If for a given $(q_k, E_k)$, the optimal policy $T_k$ does not depend on the past
values, and is time invariant, it is called a stationary Markov policy.

If $\{X_k\}$ and $\{Y_k\}$ are Markov chains then these optimization problems are Markov Decision Problems
(MDP). For simplicity, in the following we consider these problems when $\{X_k\}$ and $\{Y_k\}$ are iid. We
obtain the existence of optimal $\alpha$-discount and average cost stationary Markov policies.

Theorem 3 If $g$ is continuous and the energy buffer is finite, i.e., $e_k \le \bar{e} < \infty$ then there exists an
optimal $\alpha$-discounted Markov stationary policy. If in addition $E[X] < g(E[Y])$ and $E[X^2] < \infty$, then
there exists an average cost optimal stationary Markov policy. The optimal cost $v$ does not depend on the
initial state. Also, then the optimal $\alpha$-discount policies tend to an optimal average cost policy as $\alpha \to 1$.
Furthermore, if $v_\alpha(q, e)$ is the optimal $\alpha$-discount cost for the initial state $(q, e)$ then

$$\lim_{\alpha \to 1} (1 - \alpha) \inf_{(q,e)} v_\alpha(q, e) = v.$$

Proof: We use Prop. 2.1 in [29] to obtain the existence of an optimal $\alpha$-discount stationary Markov
policy. For this it is sufficient to verify the condition (W) in [29]. The actions possible in state $(q_k, E_k) =
(q, e)$ are $0 \le T_k \le e$. This forms a compact subset of the action space. Also this mapping is upper and
lower semicontinuous. Under action $t$, the next state becomes $((q - g(t))^+ + X_k,\; e - t + Y_k)$. When $g$ is
continuous, the mapping $t \mapsto ((q - g(t))^+ + X_k,\; e - t + Y_k)$ is a.s. continuous and hence the transition
probability is continuous under weak convergence topology. In fact it converges under the stronger topology
of setwise convergence. Also, the cost $(q, e) \mapsto q$ is continuous. Thus condition (W) in [29] is satisfied.
Not only do we get existence of an $\alpha$-discount optimal policy, from [10], we also get $v_n(q, e) \to v_\alpha(q, e)$ as
$n \to \infty$ where $v_n(q, e)$ is the $n$-step optimal $\alpha$-discount cost.

To get the existence of an average cost optimal stationary Markov policy, we use Theorem 3.8 in
[29]. This requires satisfying condition (B) in [29] in addition to condition (W). Let $J_\alpha(\delta, (q, e))$ be the
$\alpha$-discount cost under policy $\delta$ with initial state $(q, e)$. Also let

$$m_\alpha = \inf_{(q,e)} v_\alpha(q, e).$$

Then we need to show that

$$\sup_{\alpha < 1} \left(v_\alpha(q, e) - m_\alpha\right) < \infty \qquad (9)$$

for all $(q, e)$.

For this we use the TO policy described in Section III. We have shown that for this policy there is a
unique stationary distribution and if $E[X^2] < \infty$ then $E[q] < \infty$ under stationarity.

Next we use the facts that $v_\alpha(q, e)$ is non decreasing in $q$ and non increasing in $e$. We will prove these
at the end of this proof. Then $m_\alpha = v_\alpha(0, \bar{e})$.

Let $\tau$ be the first time $q_k = 0$, $E_k = \bar{e}$ when we use the TO policy. Under our conditions $E[\tau] < \infty$ if
$q_0 = 0$ for any $e_0 = e$. Also, then, using the TO policy up to time $\tau$ and an $\alpha$-discount optimal policy
afterwards cannot cost less than $v_\alpha(q, e)$, so that

$$v_\alpha(q, e) \le E\left[\sum_{k=0}^{\tau-1} \alpha^k q_k \,\middle|\, q_0 = q, e_0 = e\right] + v_\alpha(0, \bar{e}).$$

Thus,

$$v_\alpha(q, e) - v_\alpha(0, \bar{e}) \le E\left[\sum_{k=0}^{\tau-1} q_k \,\middle|\, q_0 = q, e_0 = e\right].$$

For notational convenience in the following inequalities we omit writing conditioning on $q_0 = q$, $e_0 = e$.
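The RHS

$$\le E\left[\sum_{k=0}^{\tau-1} \left(q + \sum_{l=0}^{k} X_l\right)\right]$$

$$\le E\left[\sum_{k=0}^{\tau-1} \left(q + \sum_{l=0}^{\tau-1} X_l\right)\right]$$

$$= q E[\tau] + E\left[\tau \sum_{l=0}^{\tau-1} X_l\right]$$

$$\le q E[\tau] + E[\tau^2]^{1/2}\, E\left[\left(\sum_{l=0}^{\tau-1} X_l\right)^2\right]^{1/2}.$$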

Since $\{X_k\}$ are iid and $\tau$ is a stopping time, $E[(\sum_{l=0}^{\tau-1} X_l)^2] < \infty$ if $E[\tau^2] < \infty$ and $E[X^2] < \infty$ ([9]).
In Lemma 2 in the Appendix we will show that $E[\tau^2] < \infty$ for any initial condition for the TO policy
when $E[X^2] < \infty$.
Thus we obtain

$$\sup_{0 < \alpha < 1} \left(v_\alpha(q, e) - v_\alpha(0, \bar{e})\right) < \infty$$

for each $(q, e)$. This proves (9).

Now we show that $v_\alpha(q, e)$ is non-decreasing in $q$ and non-increasing in $e$. Let $v_n$ be the $n$-step $\alpha$-discount
optimal cost where $v_0 = c$, a constant. Then $v_n$ satisfies

$$v_{n+1}(q, e) = \min_t \left\{ q + \alpha E\left[v_n\left((q - g(t))^+ + X,\; e - t + Y\right)\right] \right\}. \qquad (10)$$

To prove our assertion, we use induction. $v_0(q, e)$ satisfies the required properties. Let $v_n(q, e)$ also do so.
Then from (10) it is easy to show that $v_{n+1}(q, e)$ also satisfies these monotonicity properties. We have
shown above that

$$v_\alpha(q, e) = \lim_{n \to \infty} v_n(q, e).$$

Thus $v_\alpha(q, e)$ inherits these properties. ■

In Section III we identified a throughput optimal policy when $g$ is nondecreasing, concave. Theorem
3 guarantees the existence of an optimal mean delay policy. It is of interest to identify one such policy
also. In general one can compute an optimal policy numerically via Value Iteration or Policy Iteration
but that can be computationally intensive (especially for large data and energy buffer sizes). Also it does
not provide any insight and requires traffic and energy profile statistics. In Section III we also provided
a greedy policy (6) which is very intuitive, and is throughput optimal for linear $g$. However for concave
$g$ (including the cost function $\frac{1}{2}\log(1 + \gamma t)$) it is not throughput optimal and provides low mean delays
only for low load. Next we show that it provides minimum mean delay for linear $g$.
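
To illustrate the numerical route mentioned above, here is a minimal value-iteration sketch for the $\alpha$-discounted problem on a small integer grid. Everything concrete in it is an assumption made for illustration: the buffer caps `Q` and `Emax`, the arrival and harvest distributions `px` and `py`, the integer-valued linear $g(t) = t$, and the function name. It iterates the recursion (10) with both buffers truncated.

```python
import itertools

def value_iteration(Q=8, Emax=4, alpha=0.9, px=(0.5, 0.3, 0.2), py=(0.4, 0.4, 0.2),
                    g=lambda t: t, tol=1e-9, max_iter=10_000):
    """alpha-discounted value iteration; px[i] = P[X = i], py[j] = P[Y = j].

    g must map integers to integers so that states stay on the grid.
    Returns (v, policy) as dicts keyed by the state (q, e).
    """
    states = list(itertools.product(range(Q + 1), range(Emax + 1)))
    v = {s: 0.0 for s in states}
    policy = {}
    for _ in range(max_iter):
        v_new = {}
        for (q, e) in states:
            best, best_t = None, 0
            for t in range(e + 1):                      # feasible actions 0 <= t <= e
                cont = 0.0
                for x, p_x in enumerate(px):
                    for y, p_y in enumerate(py):
                        q2 = min(max(q - g(t), 0) + x, Q)   # truncated version of (1)
                        e2 = min(e - t + y, Emax)           # truncated version of (2)
                        cont += p_x * p_y * v[(q2, e2)]
                cost = q + alpha * cont
                if best is None or cost < best - 1e-12:     # ties keep the smaller t
                    best, best_t = cost, t
            v_new[(q, e)] = best
            policy[(q, e)] = best_t
        diff = max(abs(v_new[s] - v[s]) for s in states)
        v = v_new
        if diff < tol:
            break
    return v, policy
```

The cost grows with the product of the two buffer sizes, the number of actions and the support sizes of $X$ and $Y$, which is why this approach becomes expensive for large buffers. Note that the truncation changes the problem near the buffer caps, so the computed policy there need not match the infinite-buffer optimum.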

Theorem 4 The Greedy policy (6) is $\alpha$-discount optimal for $0 < \alpha < 1$ when $g(t) = \gamma t$ for some
$\gamma > 0$. It is also average cost optimal.

Proof: We first prove the optimality for $0 < \alpha < 1$ where the cost function is

$$J_\alpha(\delta, (q, e)) = E\left[\sum_{k=0}^{\infty} \alpha^k q_k\right]$$

for a policy $\delta$. Let there be an optimal policy that violates (6) at some time $k$, i.e., $t_k \ne \min(q_k/\gamma, E_k)$.
Clearly $t_k \le E_k$. Also taking $t_k > q_k/\gamma$ wastes energy and hence cannot be optimal. The only possibility
for an optimal policy to violate (6) is when $t_k < q_k/\gamma$ and $q_k/\gamma \le E_k$. This is done with the hope that
using the extra energy $\tilde{t}_k - t_k$ (where $\tilde{t}_k = q_k/\gamma$) later can possibly reduce the cost. However this increases
the total cost by at least

$$\gamma \alpha^k (\tilde{t}_k - t_k) - \gamma \alpha^{k+1} (\tilde{t}_k - t_k) = \gamma \alpha^k (\tilde{t}_k - t_k)(1 - \alpha) > 0$$

on that sample path. Thus such a policy can be improved by taking $t_k = \tilde{t}_k$. This holds for any $\alpha$
with $0 < \alpha < 1$. Also, from Theorem 3, under the conditions given there, an $\alpha$-discount optimal policy
converges to an average cost optimal policy as $\alpha \to 1$. This shows that (6) is also average cost optimal.

■

The fact that Greedy is $\alpha$-discount optimal as well as average cost optimal implies that it is good not
only for long term average delay but also for transient mean delays.

V. Generalizations

In this section we consider two generalizations. First we will extend the results to the case of fading
channels and then to the case where the sensing and the processing energy at a sensor node are
non-negligible with respect to the transmission energy.

In case of fading channels, we assume flat fading during a slot. In slot $k$ the channel gain is $h_k$. The
sequence $\{h_k\}$ is assumed stationary, ergodic, independent of the traffic sequence $\{X_k\}$ and the energy
generation sequence $\{Y_k\}$. Then if $T_k$ energy is spent in transmission in slot $k$, the $\{q_k\}$ process evolves
as

$$q_{k+1} = (q_k - g(h_k T_k))^+ + X_k.$$

If the channel state information (CSI) is not known to the sensor node, then $T_k$ will depend only on $(q_k, E_k)$.
One can then consider the policies used above. For example we could use $T_k = \min(E_k, E[Y] - \epsilon)$. Then
the data queue is stable if $E[X] < E[g(h(E[Y] - \epsilon))]$. We will call this policy unfaded TO. If we use
Greedy (6) then the data queue is stable if $E[X] < E[g(hY)]$.

If CSI $h_k$ is available to the node at time $k$, then the following are the throughput optimal policies. If
$g$ is linear, then $g(x) = \beta x$ for some $\beta > 0$. Then, if $0 \le h \le \bar{h} < \infty$ and $P(h = \bar{h}) > 0$, the optimal
policy is: $T(\bar{h}) = (E[Y] - \epsilon)/P(h = \bar{h})$ and $T(h) = 0$ otherwise. Thus if $h$ can take an arbitrarily large
value with positive probability, then $E[hT(h)] = \infty$ at the optimal solution.

If $g(x) = \frac{1}{2}\log(1 + \beta x)$, then the water filling (WF) policy

(11)

with the average power constraint $E[T_k] = E[Y] - \epsilon$, is throughput optimal because it maximizes
$\frac{1}{2} E_h[\log(1 + \beta h T(h))]$ with the given constraints.

Both of the above policies can be improved as before, by not wasting energy when there is not enough
data. As in (7) in Section III we can further improve WF by taking

(12)

We will call it MWF. These policies will not minimize mean delay. For that, we can use the MDP
framework used in Section IV and numerically compute the optimal policies.
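
A generic water-filling allocation for a discrete fading distribution can be sketched as follows. This is the textbook solution of maximizing $\frac{1}{2}E_h[\log(1 + \beta h T(h))]$ subject to $E[T(h)] = P$, namely $T(h) = (w - 1/(\beta h))^+$ with the water level $w$ found by bisection. The notation $w$, the function names and the bisection are assumptions made here, and the sketch is not claimed to reproduce the exact form of the paper's equation (11). The fading values are those used in Section VI, with $P = E[Y] - \epsilon = 0.95$ as an illustrative budget.

```python
import math

def water_filling(h, p, power, beta=1.0, iters=200):
    """Return T(h_i) = max(w - 1/(beta*h_i), 0) with sum_i p_i*T(h_i) = power."""
    w_lo = 0.0
    w_hi = power + max(1.0 / (beta * h_i) for h_i in h)   # spends at least `power`
    for _ in range(iters):
        w = 0.5 * (w_lo + w_hi)
        used = sum(p_i * max(w - 1.0 / (beta * h_i), 0.0) for h_i, p_i in zip(h, p))
        if used > power:
            w_hi = w
        else:
            w_lo = w
    w = 0.5 * (w_lo + w_hi)
    return [max(w - 1.0 / (beta * h_i), 0.0) for h_i in h]

def avg_rate(h, p, T, beta=1.0):
    """(1/2) E_h[log(1 + beta*h*T(h))] for a discrete fading distribution."""
    return sum(p_i * 0.5 * math.log(1.0 + beta * h_i * t_i)
               for h_i, p_i, t_i in zip(h, p, T))

h = [0.1, 0.5, 1.0, 2.2]
p = [0.1, 0.3, 0.4, 0.2]
T_wf = water_filling(h, p, power=0.95)
print([round(t, 4) for t in T_wf], round(avg_rate(h, p, T_wf), 4))
```

In this example the worst channel state receives no energy at all, and the resulting average rate is at least that of spending 0.95 in every state.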

Till now we assumed that all the energy that a node consumes is for transmission. However, sensing,
processing and receiving (from other nodes) also require significant energy, especially in more recent
higher end sensor nodes ([23]). Since we have been considering a single node so far, we will now include
the energy consumed by sensing and processing only. For simplicity, we will assume that the node is always
in one energy mode (e.g., lower energy modes [28] available for sensor nodes will not be considered). If
a sensor node with an energy harvesting system can be operated in energy neutral operation in normal
mode itself (i.e., it satisfies the conditions in Lemma 1), then there is no need to have lower energy modes.
Otherwise one has to resort to energy saving modes.

We will assume that $Z_k$ is the energy consumed by the node for sensing and processing in slot $k$.
Unlike $T_k$ (which can vary according to $q_k$), $\{Z_k\}$ can be considered a stationary ergodic sequence. The
rest of the system is as in Section II. Now we briefly describe an energy management policy which is an
extension of the TO policy in Section III. This can provide an energy neutral operation in the present
case. Improved/optimal policies can be obtained for this system also but will not be discussed due to lack
of space.

Let $c$ be the minimum positive constant such that $E[X] < g(c)$. Then if $c + E[Z] < E[Y] - \delta$ (where
$\delta$ is a small positive constant) the system can be operated in energy neutral operation: If we take $T_k = c$
(which can be done with high probability for all $k$ large enough), the process $\{q_k\}$ will have a unique
stationary, ergodic distribution and there will always be energy $Z_k$ for sensing and processing for all $k$
large enough. The result holds if $\{(X_k, Y_k, Z_k)\}$ is an ergodic stationary sequence. The arguments to show
this are similar to those in Section III and are omitted.

When the channel has fading, we need $E[X] < E[g(ch)]$ in the above paragraph.
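
The energy neutrality condition with sensing energy (without fading) amounts to a one-line check. In the sketch below the function name, the small `margin` used to make the strict inequality $E[X] < g(c)$ hold, and the numbers in the example are assumptions for illustration; $g(t) = \log(1+t)$ is assumed so that $g^{-1}$ is `math.expm1`.

```python
import math

def energy_neutral_ok(mean_X, mean_Y, mean_Z, g_inv, delta=0.01, margin=1e-6):
    """Check c + E[Z] < E[Y] - delta, with c just above g^{-1}(E[X])."""
    c = g_inv(mean_X) + margin   # assumed way to pick c with E[X] < g(c)
    return c + mean_Z < mean_Y - delta, c

ok, c = energy_neutral_ok(mean_X=0.5, mean_Y=1.0, mean_Z=0.2, g_inv=math.expm1)
print(ok, round(c, 4))
```

Raising the sensing cost to `mean_Z=0.4` with the same load makes the check fail, which is the situation in which energy saving modes would be needed.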

VI. Simulations

In this section, we compare the different policies we have studied via simulations. The $g$ function is
taken as linear ($g(x) = 10x$) or as $g(x) = \log(1 + x)$. The sequences $\{X_k\}$ and $\{Y_k\}$ are iid. (We have
also done limited simulations when $\{X_k\}$ and $\{Y_k\}$ are Autoregressive and found that conclusions drawn
in this section continue to hold). We consider the cases when $X$ and $Y$ can have exponential, uniform,
Erlang or Hyperexponential distributions. The policies considered are: Greedy, TO, $T_k = Y_k$, MTO (with
$c = 0.1$) and the mean delay optimal. At the end, we will also consider channels with fading. For the
linear $g$, we already know that the Greedy policy is throughput optimal as well as mean delay optimal.

The mean queue lengths for the different cases are plotted in Figs. 2-10.

In Fig. 2 we compare Greedy, TO and mean-delay optimal (OP) policies for nonlinear $g$. The OP was
computed via Policy Iteration. For numerical computations, all quantities need to be finite. So we took
data and energy buffer sizes to be 50 and used quantized versions of $q_k$ and $E_k$. The distribution of $X$
and $Y$ is Poisson truncated at 5. These changes were made only for this example. Now $g(E[Y]) = 1$ and
$E[g(Y)] = 0.92$. We see that the mean queue lengths of the three policies are negligible till $E[X] = 0.8$.
After that, the mean queue length of the Greedy policy rapidly increases while performances of the other
two policies are comparable till 1 (although from $E[X] = 0.6$ till close to 1, the mean queue length of TO
is approximately double that of OP). At low loads, Greedy has less mean queue length than TO.

Fig. 3 considers the case when $X$ and $Y$ are exponential and $g$ is linear. Now $E[Y] = 1$ and $g(E[Y]) =
E[g(Y)] = 10$. Now all the policies considered are throughput optimal but their delay performances differ.
We observe that the policy $T_k = Y_k$ (henceforth called unbuffered) has the worst performance. Next is
the TO.

Fig. 4 plots the case when $g$ is linear and $X$ and $Y$ are uniformly distributed. $E[Y] = 1$ and $g(E[Y]) =
E[g(Y)] = 10$. Although the comparative performance of the four policies is as in Fig. 3, performances of
the three policies are somewhat closer for this case. An interesting observation is that although the mean
delay of the Greedy for exponential distribution is close to that of the uniform case, for the unbuffered
and the TO policies, the mean delay of the exponential is much worse.

Figs. 5 and 6 provide the above results for g nonlinear. When X and Y are exponential, the results are
provided in Fig. 5 and when they are Erlang (obtained by summing 5 exponentials), they are in Fig. 6.
Now, as before, T_k = Y_k is the worst. The Greedy performs better than the other policies for low values
of E[X]. But Greedy becomes unstable at E[g(Y)] (= 2.01 for Fig. 5 and = 2.32 for Fig. 6) while the
throughput optimal policies become unstable at g(E[Y]) (= 2.40 for Fig. 5 and Fig. 6). Now for higher
values of E[X] the modified TO performs the best and is close to Greedy at low E[X].
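
The two thresholds quoted above can be checked numerically. The sketch below integrates log(1 + y) against an Erlang density with the trapezoidal rule; it assumes E[Y] = 10 (the value given in the captions of Figs. 5 and 6) and the natural logarithm, and the function names are hypothetical.

```python
import math

def erlang_pdf(y, k, mean):
    """Density of a sum of k iid exponentials with total mean `mean`."""
    theta = mean / k
    return y ** (k - 1) * math.exp(-y / theta) / (math.gamma(k) * theta ** k)

def expected_g(k, mean, g=math.log1p, upper=400.0, steps=200_000):
    """Trapezoidal approximation of E[g(Y)] for Y ~ Erlang(k) with the given mean."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        y = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(y) * erlang_pdf(y, k, mean)
    return total * h

if __name__ == "__main__":
    print("E[g(Y)], exponential:", expected_g(1, 10.0))
    print("E[g(Y)], Erlang(5):  ", expected_g(5, 10.0))
    print("g(E[Y]):             ", math.log1p(10.0))
```

Under these assumptions the three printed values should come out close to 2.01, 2.32 and 2.40 respectively. The gap between E[g(Y)] and g(E[Y]) is Jensen's inequality for concave g, and it narrows for the less variable Erlang distribution.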

Figs. 7-10 provide results for fading channels. The fading process {h_k} is iid taking values 0.1, 0.5, 1.0
and 2.2 with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. Figs. 7 and 8 are for the linear g and Figs. 9 and 10
are for the nonlinear g. The policies compared are unbuffered, Greedy, Unfaded TO and Fading TO
(WF). In Figs. 9 and 10 we have also considered Modified Unfaded TO and Modified Fading
TO (MWF).

In Fig. 7, X and Y are Erlang distributed. For this case, E[Y] = 1, E[g(hY)] = 10 and E[g(hE[Y])] =
10. We see that the stability region of fading TO is E[X] < 22.0 while that of the other
three algorithms is E[X] < 10. However, the mean queue length of fading TO is also larger from the beginning
till almost 10. This is because in fading TO, we transmit only when h takes its largest value, 2.2, which has a small
probability (= 0.2) of occurrence.
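
The figures 10 and 22.0 can be reproduced with a few lines of arithmetic. The sketch below uses the fading distribution stated above with E[Y] = 1. The function `best_state_rate` is a hypothetical helper: it assumes that, for linear g, fading TO spends average energy E[Y]/P(h = 2.2) in the slots where h = 2.2 and nothing otherwise, which is our reading of the description above rather than a definition taken from it.

```python
import math

# Fading states and probabilities quoted in the text.
H = (0.1, 0.5, 1.0, 2.2)
P = (0.1, 0.3, 0.4, 0.2)
MEAN_Y = 1.0

def unfaded_rate(g):
    """E[g(h * E[Y])]: spend E[Y] in every slot regardless of the channel."""
    return sum(p * g(h * MEAN_Y) for h, p in zip(H, P))

def best_state_rate(g):
    """Assumed fading-TO rate: spend E[Y] / P(h_max) only when h = h_max."""
    h_max, p_max = max(zip(H, P))
    return p_max * g(h_max * MEAN_Y / p_max)

def linear(t):
    return 10.0 * t

if __name__ == "__main__":
    print(unfaded_rate(linear))       # about 10
    print(unfaded_rate(math.log1p))   # about 0.64
    print(best_state_rate(linear))    # about 22
```

The second value matches the E[g(hE[Y])] = 0.64 quoted for the nonlinear g. For nonlinear g the energy would generally be spread over several channel states, so `best_state_rate` is meaningful only for the linear case.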

In Fig. 8, X and Y have Hyperexponential distributions. The distribution of the r.v. X is a mixture of 5
exponential distributions with means E[X]/4.9, 2E[X]/4.9, 3E[X]/4.9, 6E[X]/4.9 and 10E[X]/4.9 and
probabilities 0.1, 0.2, 0.2, 0.3 and 0.2 respectively. The distribution of Y is obtained in the same way. Now
E[Y] = 1, E[g(hY)] = 10 and E[g(hE[Y])] = 10. We observe the same trends here as in Fig. 7 except
that the mean queue lengths of the different algorithms vary much more in Fig. 8 than in Fig. 7.
Also, except for Fading TO, the mean queue lengths in Fig. 8 are much larger than in Fig. 7. This is
expected because the Hyperexponential distribution has much more variability than Erlang.
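
The mixture above can be checked and sampled as follows. This is a minimal sketch with hypothetical names: it confirms that the branch means average to E[X] (since 0.1·1 + 0.2·2 + 0.2·3 + 0.3·6 + 0.2·10 = 4.9) and compares the squared coefficient of variation (SCV) with that of an Erlang distribution built from 5 exponentials.

```python
import random

WEIGHTS = (1.0, 2.0, 3.0, 6.0, 10.0)   # branch means are WEIGHTS[i] * E[X] / 4.9
PROBS = (0.1, 0.2, 0.2, 0.3, 0.2)

def branch_means(mean):
    return [w * mean / 4.9 for w in WEIGHTS]

def hyperexp_moments(mean):
    """Return (mean, squared coefficient of variation) of the mixture."""
    m = branch_means(mean)
    first = sum(p * mi for p, mi in zip(PROBS, m))
    # The second moment of an exponential with mean mi is 2 * mi**2.
    second = sum(p * 2.0 * mi * mi for p, mi in zip(PROBS, m))
    return first, second / first ** 2 - 1.0

def sample_hyperexp(mean, rng):
    """Draw one sample: pick a branch, then draw an exponential with its mean."""
    mi = rng.choices(branch_means(mean), weights=PROBS)[0]
    return rng.expovariate(1.0 / mi)

ERLANG5_SCV = 1.0 / 5.0   # a sum of 5 iid exponentials has SCV 1/5

if __name__ == "__main__":
    print(hyperexp_moments(1.0), ERLANG5_SCV)
```

The mixture's SCV works out to about 1.79, roughly nine times the Erlang value of 0.2, which is consistent with the larger spread of mean queue lengths observed in Fig. 8.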

Figs. 9 and 10 consider nonlinear g. In Fig. 9, X and Y are Erlang distributed and in Fig. 10, X and Y are
Hyperexponential as in Figs. 7 and 8. In Fig. 9, E[Y] = 1, E[g(hY)] = 0.62, E[g(hE[Y])] = 0.64 while
in Fig. 10, E[Y] = 1, E[g(hY)] = 0.51 and E[g(hE[Y])] = 0.64. Now we see that the stability region of
unbuffered and Greedy is the smallest, then that of TO and MTO, while WF and MWF provide the largest
region and are stable for E[X] < 0.70. MTO and MWF provide improvements in mean queue lengths
over TO and WF. The difference in stability regions is smaller for the Erlang distribution.

Fig. 2. Mean Delay Optimal, Greedy, TO Policies with No Fading; Nonlinear g; Finite, Quantized data and energy buffers; X, Y: Poisson
truncated at 5; E[Y] = 1, E[g(Y)] = 0.92, g(E[Y]) = 1





Fig. 3. Comparison of policies with No Fading; g(x) = 10x; X, Y: Exponential; E[Y] = 1, E[g(Y)] = 10, g(E[Y]) = 10

Fig. 4. Comparison of policies with No Fading; g(x) = 10x; X, Y: Uniform; E[Y] = 1, E[g(Y)] = 10, g(E[Y]) = 10




Fig. 5. Comparison of policies with No Fading; g(x) = log(1 + x); X, Y: Exponential; E[Y] = 10, E[g(Y)] = 2.01, g(E[Y]) = 2.4

Fig. 6. Comparison of policies with No Fading; g(x) = log(1 + x); X, Y: Erlang(5); E[Y] = 10, E[g(Y)] = 2.32, g(E[Y]) = 2.4



Fig. 7. Comparison of policies with Fading; g(x) = 10x; X, Y: Erlang(5); E[Y] = 1, E[g(Y)] = 10, g(E[Y]) = 10

Fig. 8. Comparison of policies with Fading; g(x) = 10x; X, Y: Hyperexponential(5); E[Y] = 1, E[g(Y)] = 10, g(E[Y]) = 10

Fig. 9. Comparison of policies with Fading; g(x) = log(1 + x); X, Y: Erlang(5); E[Y] = 1, E[g(hY)] = 0.62, E[g(hE[Y])] = 0.64;
WF, Mod. WF stable for E[X] < 0.70

Fig. 10. Comparison of policies with Fading; g(x) = log(1 + x); X, Y: Hyperexponential(5); E[Y] = 1, E[g(hY)] = 0.51, E[g(hE[Y])] =
0.64; WF, Mod. WF stable for E[X] < 0.70

VII. Conclusions

We have considered a sensor node with an energy harvesting source, deployed for random field
estimation. Throughput optimal and mean delay optimal energy management policies are identified which
can make the system work in energy neutral operation. The mean delays of these policies are compared
with other suboptimal policies via simulations. It is found that having energy storage allows a larger stability
region as well as lower mean delays.

We have extended our results to fading channels and to the case when energy at the sensor node is also consumed
in sensing and data processing. Similarly we can include leakage/wastage of energy when it is stored in
the energy buffer and when it is extracted. Suitable MACs for such sensor nodes have also been studied
in [27].

IX. Appendix



To avoid trivialities we assume $P[X_k > 0] > 0$. For the following lemma we also assume that
$P[X_k = 0] > 0$.



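Lemma 2: When $\{X_k\}$, $\{Y_k\}$ are iid, $E[X] < g(E[Y] - \epsilon)$, $e \le \bar{e}$, and $E[X^\alpha] < \infty$ for some $\alpha \ge 1$,
then

$$\tau \triangleq \inf\{k \ge 1 : (q_k, E_k) = (0, \bar{e})\}$$

satisfies $E[\tau^\alpha] < \infty$ for any $(q_0, E_0) = (q, e)$.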
Proof: Let

$$A = \{(q, e) : q + e \le \beta\}$$

where $\beta$ is an appropriately defined positive, finite constant. We will first show that starting from any
initial $(q_0, E_0) = (q, e)$ the first time $\bar{\tau}$ to reach $A$ satisfies $E[\bar{\tau}^\alpha] < \infty$. Next we will show that with a
positive probability in a finite (bounded) number of steps $(q_k, E_k)$ can reach from $A$ to $(0, \bar{e})$. Then by a
standard coin tossing argument, we will obtain $E[\tau^\alpha] < \infty$.

To show $E[\bar{\tau}^\alpha] < \infty$, we use a result in [13, p. 116]. Then it is sufficient to show that for $h(q, e) = q$,

$$\sup_{(q,e) \notin A} E\left[h(q_1, E_1) - h(q, e) \mid q_0 = q, E_0 = e\right] < -\delta \qquad (13)$$

for some $\delta > 0$ and

$$E\left[|h(q_1, E_1) - h(q, e)|^\alpha \mid (q_0, E_0) = (q, e)\right] < \infty \qquad (14)$$

for all $(q, e)$.

Instead of using (13), (14) on the Markov chain $\{(q_k, E_k)\}$ we use it on the Markov chain
$\{(q_{Mk}, E_{Mk}), k \ge 0\}$ where $M > 0$ is an appropriately large positive integer. Thus for (14) we have
to show that

$$E\left[|q_M - q|^\alpha \mid q_0 = q\right] < \infty$$

which holds if $E[X^\alpha] < \infty$.

Next we show (13). Taking $\beta$ large enough, since $T_k \le \bar{e}$, we get for $(q, e) \notin A$,

$$E\left[h(q_M, E_M) - h(q_0, E_0) \mid (q_0, E_0) = (q, e)\right] = E\left[q + \sum_{n=0}^{M-1} (X_n - g(T_n)) - q \;\Big|\; (q_0, E_0) = (q, e)\right].$$

Thus, (13) is satisfied if

$$E[X] < \frac{1}{M} \sum_{k=1}^{M} E\left[g(T_k) \mid (q_0, E_0) = (q, e)\right] - \delta.$$

But for TO,

$$\frac{1}{M} \sum_{k=1}^{M} E\left[g(T_k) \mid E_0 = e\right] \to g(E[Y] - \epsilon) \qquad (15)$$

and thus there is an $M$ (choosing one corresponding to $e = 0$ will be sufficient for other $e$) such that if
$E[X] < g(E[Y] - \epsilon)$, then (13) will be satisfied for some $\delta > 0$.
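
The convergence of the time average of $g(T_k)$ to $g(E[Y] - \epsilon)$ can be observed numerically. The sketch below is illustrative only and rests on assumptions not restated in this appendix: the TO rule is taken as $T_k = \min(E_k, E[Y] - \epsilon)$, the energy buffer is taken to be infinite, $Y$ is exponential, $g(t) = \log(1 + t)$, and $\epsilon = 0.1$ is a hypothetical value.

```python
import math
import random

def average_to_rate(mean_y=1.0, eps=0.1, e0=0.0, slots=200_000, seed=2):
    """(1/M) * sum of g(T_k) under the assumed TO rule, with g(t) = log(1 + t)."""
    rng = random.Random(seed)
    e, total = e0, 0.0
    cap = mean_y - eps
    for _ in range(slots):
        t = min(e, cap)                                # assumed TO rule
        total += math.log1p(t)
        e = (e - t) + rng.expovariate(1.0 / mean_y)    # infinite buffer assumed
    return total / slots

if __name__ == "__main__":
    print(average_to_rate(), math.log1p(1.0 - 0.1))
```

Starting from an empty energy buffer ($e_0 = 0$), the average stays below $g(E[Y] - \epsilon)$ and approaches it as the number of slots grows, because the stored energy drifts upward at rate $\epsilon$ and the cap $E[Y] - \epsilon$ is eventually met in almost every slot.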

Now we show that from any point $(q, e) \in A$, the process can reach the state $(0, \bar{e})$ with a positive
probability in a finite number of steps. Choose positive $\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4$ such that $P[X_k = 0] = \epsilon_1 > 0$ and
$P[Y_k > \epsilon_3] > \epsilon_4$, $g(\epsilon_3) = \epsilon_2$, where such positive constants exist under our assumptions. Then with probability

$$\ge (\epsilon_1 \epsilon_4)^{\lceil \beta/\epsilon_2 \rceil + \lceil \bar{e}/\epsilon_3 \rceil},$$

$(q_k, E_k)$ reaches $(0, \bar{e})$ in $\lceil \beta/\epsilon_2 \rceil + \lceil \bar{e}/\epsilon_3 \rceil$ steps, where $\lceil x \rceil$ denotes the smallest integer $\ge x$.